(57) ABSTRACT


The present invention is a method, system, and computer 
program product for implementation of a capable, general 
purpose compression algorithm that can be engaged “on the 
fly”. This invention has particular practical application with 
time-series data, and more particularly, time-series data 
obtained from a spacecraft, or similar situations where cost,
size and/or power limitations are prevalent, although it is not 
limited to such applications. It is also particularly applicable 
to the compression of serial data streams and works in one, 
two, or three dimensions. The original input data is approxi- 
mated by Chebyshev polynomials, achieving very high 
compression ratios on serial data streams with minimal loss 
of scientific information. 
DATA COMPRESSION USING CHEBYSHEV
TRANSFORM 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

This application claims the benefit of prior filed U.S. 
provisional Application No. 60/400,326, filed on Aug. 1, 
2002, the complete disclosure of which is incorporated fully 
herein by reference. 

STATEMENT OF GOVERNMENTAL INTEREST 

This invention was made with Government support under 
Contract No. NAG5-8688 awarded by the National Aero- 
nautics and Space Administration (NASA). The Govern- 
ment has certain rights in the invention. 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates to the field of data compression. 

2. Description of the Related Art 

With the explosion of the digital age, there has been a 
tremendous increase in the amount of data being transmitted 
from one point to another. Data may be traveling within an 
office, within a country, from one country to another, from 
Earth to locations in outer space, or from locations in outer 
space back to Earth. 

Increasingly capable instruments and ever more ambi- 
tious scientific objectives produce ever-greater data flows to 
scientists, and this is particularly true for scientific missions 
involving spacecraft. A large variety of compression algo- 
rithms have been developed for imaging data, yet little 
attention has been given to the large amount of data acquired 
by many other types of instruments. In particular, radar 
sounders, radar synthetic aperture mappers, mass spectrom- 
eters, and other such instruments have become increasingly 
important to space missions. Although the volume of sci- 
entific data obtained has grown with the increasing sophis- 
tication of the instruments used to obtain the data, spacecraft 
capabilities for telecommunications bandwidth have not 
grown at the same rate. The tightening constraints on
spacecraft cost, mass, power, and size limit the amount of
resources that can be devoted to relaying the science data
from the spacecraft to the ground. Competition for use of
ground station resources, such as the NASA Deep Space
Network, further limits the number of bits that can be
transmitted to Earth.

One approach to increase the “scientific return” in the face 
of these constraints is to use data compression, as has been 
adopted for many NASA scientific missions. For example, 
the Galileo mission to explore the planet Jupiter and its 
moons has made extensive use of lossy image compression 
methods, such as the discrete cosine transform, after the high 
gain antenna of the Galileo spacecraft failed to deploy 
properly. By compressing the data, the Galileo team was 
able to capture data using the spacecraft’s smaller, properly 
functioning antenna. 

Other missions, like NEAR, make routine use of both 
lossless and lossy image compression to reduce data vol- 
ume, employing several different algorithms. In both the 
NEAR and Galileo programs, scientists felt that the inevi- 
table loss of information associated with data compression 
and decompression was more than compensated by the 
opportunity to return more measurements; that is, there is a net
scientific gain when more measurements are returned (or
higher temporal/spatial/spectral resolution is achieved),
even with some loss of fidelity in the data returned.

Standard image compression methods like discrete cosine
transforms and related methods are optimized for image data
and are not easily adaptable to the data streams from
non-image sources (e.g., a spectrometer) or to time-series
data sources (those with a time component, such as video),
and their performance characteristics (in terms of what
information is lost by compression) are not necessarily
optimal for such time-series data. One reason for this is that
image compression methods take advantage of 2-dimensional
spatial correlations generally present in images, but
such correlations are absent or qualitatively different in
time-series data, such as data from a spectrometer or
particle/photon counter. However, the need for compression of
non-image data is growing and will continue to grow in the
future. For example, hyper-spectral images from a scanning
spectrograph are particularly high bandwidth but not suited
for compression by standard techniques. Further, lossless
compression methods such as Huffman encoding, run-length
encoding, and Fast and Rice algorithms, and lossy methods
such as straight quantization, provide relatively small
compression rates. Thus, it would be desirable to significantly
increase the time resolution of such an instrument within the
bandwidth allocation currently available, and increase the
compression ratio available when compressing this data,
while still retaining the scientific value of the compressed
data, and while being able to use the same compression
method for single or multi-dimensional applications.


SUMMARY OF THE INVENTION 

The present invention is a method, system, and computer
program product for implementation of capable, general
purpose compression that can be engaged “on the fly”. This
invention is applicable to the compression of any data type,
including time-series data, and has particular practical
application on board spacecraft, or similar situations where cost,
size and/or power limitations are prevalent, although it is not
limited to such applications. It is also particularly applicable
to the compression of serial data streams and works in one
dimension for time-series data; in two dimensions for image
data; and in three dimensions for image cube data. The
original input data is approximated by Chebyshev polynomials,
achieving very high compression ratios on serial data
streams with minimal loss of scientific information.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a flowchart illustrating the basic steps performed
in accordance with the present invention;

FIG. 2 is a block diagram of the steps of FIG. 1,
containing a small sample of simulated time-series data; and

FIG. 3 illustrates the result of applying the Chebyshev
transform to a sample data set.

DESCRIPTION OF THE PREFERRED 
EMBODIMENT(S)

An embodiment of the present invention is described
herein with reference to compression of data obtained by a
spacecraft; however, it is understood that the claims of the
present invention are intended to cover data compression of
all kinds of data, for example, medical imaging data, data
acquisition/logging applications, and the like. From a
technical standpoint, a general purpose compression method for
use on-board spacecraft should have the following properties:
low computational burden, particularly in the compression
step that occurs on-board the spacecraft where there is
limited power and CPU cycles; flexibility to work in small
data blocks without requiring any specific data block size;
minimal demands on buffer memory to perform compression
of real time data; compression and decompression of a
data block completely independent of that in other blocks so
there is minimum loss of information in the event of data
dropouts or corruption; and high quantitative fidelity of
decompressed data to the original data stream as measured
by scientific information content. Use of Chebyshev
polynomials as described herein meets all of these criteria.

FIG. 1 is a flowchart illustrating the basic steps performed
in accordance with the present invention and FIG. 2 is a
block diagram of the same steps, containing a small sample
of simulated time-series data. At step 100, the original input
data is divided into blocks to form a matrix of a
predetermined size. For one-dimensional data, the size of each
matrix will be N×1 (i.e., they will have a single “depth”
dimension), while for two-dimensional applications, it is
convenient, though not required, to have the data matrices be
square, i.e., N×N blocks, and for three-dimensional
applications it is convenient to have the data matrix be a cube, i.e.,
N×N×N blocks. The optimal block size (the value of N) is
a compromise among several factors and is not necessarily
related to the bit depth. The “best” matrix size for a given
application depends on the available computational
resources and the nature of the data (i.e., what degree and
types of information loss can be tolerated). A larger matrix
size can give higher compression performance but will
require more computation. Applicant has found N=8 and
N=16 to be common acceptable choices, though other
choices are also acceptable.

For applications of the method in two or more dimensions,
the original dataset to be compressed actually consists
of “frames” that are sampled at a sequence of times. Each
frame is an array of data values, in one or more dimensions.

An example of a one-dimensional array making up a data
frame would be the data from a line-scan imager (also called
a whisk-broom imager), where the dimension in the data
array is spatial. Another such example would be the data
from a multichannel particle analyzer, where the dimension
in the data array could be particle energy. Successive frames
(a total of N) would be buffered and interleaved to produce
N×M blocks, in which the first dimension is temporal and
the other is that corresponding to the data frame of M
samples. A second class of applications would be to datasets
where each frame is a two-dimensional array, such as for
video imaging (two spatial dimensions) or for spectral
imaging (one dimension spatial and one spectral). To apply
the two-dimensional compression to this second class of
datasets, successive frames also need to be buffered and
interleaved, so that N×N blocks, for instance, are formed
with time as one dimension. The other dimension is chosen
by the user and is, for best performance, the dimension in
which the data frames have greater redundancy of information.
Since the application illustrated in FIG. 2 is a
one-dimensional application, each matrix size is 16×1. As can be
seen on the left side of FIG. 2, 32 data points are illustrated
and they have been divided into blocks (block B1,
block B2) of 16 data points each.

At step 102, the Chebyshev transform is applied to a first
data block (block B1 in FIG. 2). In this example, this
results in 16 Chebyshev coefficients for the data in block B1.
The results of applying the Chebyshev transform to block B1
are shown in the matrix illustrated in FIG. 3.
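As a rough illustration of step 102, the following sketch applies a one-dimensional Chebyshev transform to a single block and inverts it. It assumes the standard Chebyshev approximation formulas evaluated at the Chebyshev nodes, with the block samples treated as function values at those nodes; the function names and the sample values are hypothetical and are not taken from FIG. 2 or FIG. 3.

```python
import math


def chebyshev_forward(block):
    """Return N Chebyshev coefficients for an N-sample block.

    Assumes c_j = (2/N) * sum_k f_k * cos(pi * j * (k + 1/2) / N), zero-based.
    """
    n = len(block)
    return [
        (2.0 / n) * sum(f * math.cos(math.pi * j * (k + 0.5) / n)
                        for k, f in enumerate(block))
        for j in range(n)
    ]


def chebyshev_inverse(coeffs):
    """Rebuild the block: f_k = sum_j c_j * T_j(x_k) - c_0 / 2."""
    n = len(coeffs)
    return [
        sum(c * math.cos(math.pi * j * (k + 0.5) / n)
            for j, c in enumerate(coeffs)) - 0.5 * coeffs[0]
        for k in range(n)
    ]


# Made-up 16-sample block (not the data of FIG. 2).
block = [float((3 * k) % 7 + k) for k in range(16)]
coeffs = chebyshev_forward(block)
restored = chebyshev_inverse(coeffs)
```

With no thresholding or quantization applied, the round trip reproduces the block to floating-point precision; the loss in the method comes only from the later steps.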
At step 104, thresholding is performed for each coefficient in the
matrix currently being processed. This
is best illustrated in FIG. 2, where the results of the
Chebyshev transform for each block are plotted on graphs G1
and G2. In the example of FIG. 2, a threshold of -10 to +10
has been established, as illustrated by the threshold lines T1
and T2 at these two points (on each graph). In accordance
with the present invention, the thresholding step 104
involves the retention of coefficients whose magnitude exceeds the given
threshold (e.g., in this example, above 10 or below -10).
Thus, the coefficients for data points 1, 2, 3, and 5 are retained and the
coefficients for data points 4 and 6-16 are discarded.

At step 106, the retained coefficients (those retained after 
the thresholding process) are quantized. The quantizing is 
performed because the retained values can be large numbers 
that would require significant amounts of memory to store; 
if a floating-point processor is used, each number could be 
as large as 64 bits if stored as double-precision. By quan- 
tizing the retained values they can be reduced in size to as 
few as 8 or even 6 bits, depending upon how much com- 
pression is required and how much loss can be tolerated. 

In the example of FIG. 2, the quantization is performed by 
mapping the amplitudes of the retained coefficients as fol- 
lows. 

The basic Chebyshev algorithm applies a quantizer with
a fixed step size (uniform quantizer) to the coefficients
retained after thresholding. For each block, the maximum
and minimum coefficient values, $c_{\max}$ and $c_{\min}$, are stored
and used to determine the quantizer step size. The quantized
coefficients Q(i) are calculated as follows:

$$Q(i) = (2^m - 1)\,\frac{c(i) - c_{\min}}{c_{\max} - c_{\min}}, \quad i = 1, \ldots, N$$

where m is the number of bits to which the coefficients are
quantized, N is the block size, and c(i) is the i-th retained
coefficient in the block. The distortion introduced by
uniform quantization can be measured by setting the threshold
to zero in the Chebyshev algorithm, forcing the algorithm to
retain and quantize all coefficients.

Basically, for each plot, the largest and smallest coefficient
amplitudes are kept and are used to linearly map the other
amplitudes into the range 0 to 2^8 - 1 (0 to 255), where 8 is the
number of bits being used to store the coefficients. It is noted
that the number of bits can be any number; the larger the
number, the more accurate the reconstructed signal will be,
but the lower the compression ratio.
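The uniform quantizer described above can be sketched as follows. This is a minimal illustration, not the patented implementation: rounding to the nearest integer and the handling of a block whose coefficients are all equal are assumptions, and the function names and sample values are hypothetical.

```python
def uniform_quantize(coeffs, m_bits):
    """Map retained coefficients onto integers in 0 .. 2**m_bits - 1."""
    c_min, c_max = min(coeffs), max(coeffs)
    levels = 2 ** m_bits - 1
    if c_max == c_min:
        # Degenerate block (assumption): every coefficient is identical.
        return [0] * len(coeffs), c_min, c_max
    # Rounding to the nearest integer is an assumption; the text gives
    # only the linear mapping.
    q = [round(levels * (c - c_min) / (c_max - c_min)) for c in coeffs]
    return q, c_min, c_max


def uniform_dequantize(q, c_min, c_max, m_bits):
    """Invert the mapping using the stored block minimum and maximum."""
    step = (c_max - c_min) / (2 ** m_bits - 1)
    return [c_min + value * step for value in q]


retained = [-20.0, 0.0, 13.0, 60.0]   # made-up retained coefficients
q, c_min, c_max = uniform_quantize(retained, 8)
restored = uniform_dequantize(q, c_min, c_max, 8)
```

The reconstruction error of each coefficient is bounded by half the quantizer step, which is why storing more bits improves fidelity at the cost of compression ratio.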

The Chebyshev coefficients tend to quickly approach zero 
as j (see equation 5 below) increases. The distribution of 
coefficient values therefore has a higher mass near zero, and 
approximating the coefficients better in this high probability 
region will reduce quantization distortion. This can be 
achieved by expanding the quantization intervals near zero 
using a compander function, then uniformly quantizing. A 
compander compresses dynamic range on encode and 
expands dynamic range on decode. Ideally the Chebyshev 
coefficient compander will stretch values near zero while 
compressing values near the coefficient maximum and mini- 
mum. 

Several compander functions will work to expand the 
quantization intervals near zero, including logarithmic and 
trigonometric functions, as well as the Mu-law and A-law 
companders generally used for reducing dynamic range in 
audio signals before quantization. The inverse hyperbolic 
sine function was found by the applicant to perform
particularly well in expanding Chebyshev coefficients near the
origin and compressing coefficients away from the origin.

If the high frequency coefficients cluster near zero, the
compander almost entirely smooths out the high frequency
noise. But along with the advantage of reducing high
frequency distortion comes the disadvantage of an often poorly
represented DC coefficient. This problem is easily solved by
storing the DC coefficient separately and applying the
compander only to the AC coefficients.
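A minimal sketch of an inverse-hyperbolic-sine compander is shown below. The `scale` parameter and the function names are hypothetical additions for illustration; the text states only that the inverse hyperbolic sine was found to work well and that the DC coefficient is stored separately.

```python
import math


def compand(c, scale=1.0):
    """Encode side: stretch values near zero, compress large magnitudes."""
    return math.asinh(c / scale)


def expand(y, scale=1.0):
    """Decode side: exact inverse of compand()."""
    return scale * math.sinh(y)


# Made-up AC coefficients; the DC coefficient would be stored separately.
ac_coeffs = [0.4, -1.5, 12.0, -80.0]
companded = [compand(c) for c in ac_coeffs]
restored = [expand(y) for y in companded]

# A unit interval near zero occupies far more of the companded range
# than a unit interval far from zero.
near_zero_span = compand(1.0) - compand(0.0)
far_span = compand(101.0) - compand(100.0)
```

Uniformly quantizing the companded values therefore spends more quantization levels on small coefficients, where most of the probability mass lies.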

Other known techniques for quantization can be
performed, for example, floating-point quantization. An
example using floating-point quantization is discussed in
more detail below in connection with a two-dimensional
example.

At step 108, a bit control word is created so that the
retained data points can be inserted at the appropriate
location in the data signal when it is decompressed, and so
that place holders (e.g., zeroes) can be inserted in the
appropriate locations where data points have not been
retained. In the one-dimensional application illustrated in
FIG. 2, since the matrix is a 16×1 matrix, there will be 16
bits in each control word; for an N×N or N×M matrix
(where M is the depth of the matrix), there will be N×N or
N×M bits in the control words, respectively. In the example
of FIG. 2, the control word for the first data block, block B1,
is 1110100000000000. This indicates to the decompression
program that the first three positions (the first three “ones”) will
each contain a data point to be decompressed or reconstructed, the
fourth position (designated by the “0”) will contain a place
holder, the fifth position (“1”) will contain a reconstructed data
point, and the remaining positions (all zeroes) will be given place
holders.
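The thresholding of step 104 and the control word of step 108 can be sketched together as follows. The coefficient values are invented so that the result reproduces the control word quoted for block B1; they are not the values plotted in FIG. 2, and the function names are hypothetical.

```python
def threshold_block(coeffs, threshold):
    """Keep coefficients with magnitude above the threshold.

    Returns the retained values and a control word with one bit per
    coefficient (1 = retained, 0 = discarded).
    """
    control_word = [1 if abs(c) > threshold else 0 for c in coeffs]
    retained = [c for c, bit in zip(coeffs, control_word) if bit]
    return retained, control_word


def restore_block(retained, control_word):
    """Re-insert retained values and fill discarded positions with zeros."""
    values = iter(retained)
    return [next(values) if bit else 0.0 for bit in control_word]


# Made-up coefficients for a 16x1 block.
coeffs = [120.0, -45.5, 30.2, 4.1, -18.7, 9.9, -3.0, 2.2,
          1.0, -0.5, 0.3, 0.0, 0.1, -0.2, 0.05, 0.0]
retained, control_word = threshold_block(coeffs, 10.0)
control_string = "".join(str(bit) for bit in control_word)
rebuilt = restore_block(retained, control_word)
```

Only the four retained values and the 16-bit control word need to be carried forward; the decoder rebuilds a full-length coefficient vector from them.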

At step 110, the control word created in step 108 can be
encoded using lossless compression techniques (an example
of which is given in more detail below). By compressing the
control word, the compression ratio can be significantly
increased without any additional data loss.

At step 112, a determination is made as to whether or not
there are additional data blocks to be processed. Since the
data is processed on the fly, data blocks may be continuously
accumulating. If there are additional data blocks to be
processed at this time, the process proceeds back to step 102
for processing of the next data block. If not, the process
proceeds to step 114.

At step 114, the compressed data can be transmitted with
its control word so that, upon receipt, decompression can
take place by, for example, applying the inverse transform as
described below.

Thus, as noted above, the Chebyshev algorithm is based
on three parameters: first, block size, which is the number of
samples used per iteration of the compression method;
second, the threshold level, the minimum value of the
coefficients to be retained; and third, the number of bits,
which is the number of bits to which each coefficient is
quantized. By varying the parameters for different types of
data, good compression ratios can be achieved with minimal
error. Higher threshold values always result in better
compression ratios at the expense of reconstructed signal quality.
Increasing the number of bits generally decreases the compression
ratio due to the additional bits stored, but gives better
reconstructed signal quality. Increasing the block size has
more varied results, but a large block size generally
increases the compression ratio because not as many block
maxima and minima are being stored.

Following is an example of a preferred embodiment of the
present invention used in connection with two-dimensional
data (e.g., image data).


EXAMPLE 

Algorithm Outline 

Onboard the spacecraft, divide the image into square 
blocks. To each block: 

Apply the Chebyshev transform 

Eliminate the low-amplitude transform coefficients 

Quantize the retained coefficients 

Encode and transmit the retained coefficients 

On the ground, decode and apply block artifact reduction 
algorithm (optional). 

Onboard Operations 

Step 1: Block Processing 

Divide the image into N×N blocks (N=8 in this example).
Each block will then have the form

$$F = \begin{bmatrix} f_{11} & \cdots & f_{1N} \\ \vdots & \ddots & \vdots \\ f_{N1} & \cdots & f_{NN} \end{bmatrix}$$

where $f_{ij}$ is the ij-th pixel in the block.

This will require buffering N rows of the image before the
first block can be processed.

If the image size cannot be divided evenly by N, up to
N-1 rows and up to N-1 columns need to be padded with
the adjacent pixel value.

Step 2: Transform 

Apply the Chebyshev transform to each block:

$$c_{ij} = \frac{4}{N^2} \sum_{m=1}^{N} \sum_{k=1}^{N} f_{mk}\, T_{im}\, T_{jk}, \quad i, j = 1, \ldots, N.$$

$c_{ij}$ is the ij-th Chebyshev transform coefficient and $T_{ij}$ is a
cosine lookup table stored onboard the spacecraft.

The transform can also be written in matrix form,

$$C = \frac{4}{N^2}\, T F T^{T}.$$

Step 3: Thresholding 

Set all coefficients, $c_{ij}$, whose absolute value is less than
a commanded threshold value to zero. The exception is the
first coefficient, $c_{11}$, which has the highest amplitude and is
always retained. The retained coefficients are defined by

$$c'_{ij} = \begin{cases} c_{ij}, & i = j = 1 \\ c_{ij}, & |c_{ij}| > \text{threshold} \\ 0, & \text{otherwise} \end{cases} \qquad i, j = 1, \ldots, N.$$

This results in a matrix whose elements are primarily
zero. Only the nonzero coefficients are quantized, encoded,
and transmitted.

Step 4: Quantization 

Quantize the retained (non-zero) coefficients by rounding
the mantissa of the binary representation to P bits (sign bit
included), and storing only Q positive bits in the binary
exponent (because the threshold value will always be greater
than 1.0, the need for negative exponents is eliminated,
saving one bit per retained coefficient).

The specific steps are: 

1) Convert each coefficient to its binary representation,
retaining the sign bit (0 positive, 1 negative).

2) Shift the radix point to the leading 1 and increment the
exponent (normalization).

3) Retain the sign bit and P-1 bits beyond the radix point,
rounding if the next bit is a 1. Do not store the leading 1
in the mantissa. It is assumed and will save us 1 bit.

4) Retain the exponent as a Q-bit value.

5) Convert the Q exponent bits and the P mantissa bits (sign
bit leading) to a decimal integer.

For example, quantize the floating-point value 673.4375
to 8 bits using a mantissa of P=4 bits and an exponent of Q=4
bits. Following the steps above,

1) The binary representation is 0 1010100001.0111 (the
sign bit leads).

2) Normalized: 0 1.0101000010111 × 2^9

3) Retain the 4 bits 0 011 in the mantissa (note that 1.0101
was rounded to obtain 1.011, and the leading 1 was not
retained).

4) Retain the exponent 9 (1001 in Q=4-bit binary).

5) Convert the exponent bits followed by the sign bit and the
mantissa bits (1001 0 011) to an integer: 147

The leading 1 in the normalization is assumed and must
be replaced when converting back to floating point. The
restored floating-point value is then +1.011 × 2^9, or 704.
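The floating-point quantization steps can be sketched as follows. The bit layout (exponent in the high bits, then the sign bit, then P-1 mantissa bits) follows the worked example; the function names, the use of `math.frexp` for normalization, and the handling of a mantissa that rounds up to the next power of two are assumptions.

```python
import math


def quantize_float(value, p_bits, q_bits):
    """Pack a nonzero value into q_bits of exponent and p_bits of
    sign-plus-mantissa, returning the result as an integer."""
    sign = 1 if value < 0 else 0
    # abs(value) = fraction * 2**exp2 with 0.5 <= fraction < 1
    fraction, exp2 = math.frexp(abs(value))
    exponent = exp2 - 1              # normalized form is 1.xxx * 2**exponent
    frac_bits = p_bits - 1           # mantissa bits stored after the sign bit
    # Round half up on the first dropped bit; the leading 1 is implicit.
    mantissa = math.floor((2.0 * fraction - 1.0) * 2 ** frac_bits + 0.5)
    if mantissa == 2 ** frac_bits:   # rounding carried into the next exponent
        mantissa, exponent = 0, exponent + 1
    if not 0 <= exponent < 2 ** q_bits:
        raise ValueError("exponent does not fit in q_bits")
    return (exponent << p_bits) | (sign << frac_bits) | mantissa


def dequantize_float(code, p_bits):
    """Restore the implicit leading 1 and rebuild the floating-point value."""
    frac_bits = p_bits - 1
    exponent = code >> p_bits
    sign = (code >> frac_bits) & 1
    mantissa = code & (2 ** frac_bits - 1)
    magnitude = (1.0 + mantissa / 2 ** frac_bits) * 2.0 ** exponent
    return -magnitude if sign else magnitude


code = quantize_float(673.4375, p_bits=4, q_bits=4)
restored = dequantize_float(code, p_bits=4)
```

For the value used in the text, `code` is 147 and `restored` is 704.0, matching steps 1) through 5) above.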

For 12-bit planetary images compressed using 8×8
blocks, Q=4 bits are stored in the exponent of all thresholded
Chebyshev coefficients. The mantissa of coefficient $c_{11}$ is
rounded to P=7 bits (6 bits plus sign bit), and the mantissa
of all other coefficients is rounded to P=4 bits (3 bits plus
sign bit).

By placing the exponent in the most significant bits of the
quantized coefficient and the mantissa in the least significant
bits, more efficient lossless encoding (step 8) is achieved.
For the lossless encoding step, the quantized values are
treated as integers.

Step 5: Control Word 

A ‘control word’ stores the original matrix location of
each coefficient in a block and must be transmitted along
with the retained coefficients.

The control word cw for any given block is defined by
mapping each coefficient to either 0 (if it is set to zero after
thresholding) or to 1:

$$cw_{ij} = \begin{cases} 0, & c'_{ij} = 0 \\ 1, & \text{otherwise} \end{cases}$$

Step 6: Zigzag Scan 

After thresholding, most of the non-zero coefficients will
be clustered in the upper left hand corner of the coefficient
matrix. Map both the coefficient matrix and the control word
matrix to a vector using a zigzag scan. This enables more
efficient run-length encoding of the control word cw and
coefficient matrix $c'_{ij}$.


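For an 8×8 matrix, the zigzag mapping is defined by

$$\begin{bmatrix} c_{11} & c_{12} & \cdots & c_{18} \\ c_{21} & c_{22} & \cdots & c_{28} \\ \vdots & \vdots & \ddots & \vdots \\ c_{81} & c_{82} & \cdots & c_{88} \end{bmatrix} \rightarrow$$

$$[\, c_{11}\; c_{12}\; c_{21}\; c_{31}\; c_{22}\; c_{13}\; c_{14}\; c_{23}\; c_{32}\; c_{41}\; c_{51}\; c_{42}\; c_{33}\; c_{24}\; c_{15}\; c_{16}\; c_{25}\; c_{34}\; c_{43}\; c_{52}\; c_{61}\; c_{71}\; c_{62}$$
$$c_{53}\; c_{44}\; c_{35}\; c_{26}\; c_{17}\; c_{18}\; c_{27}\; c_{36}\; c_{45}\; c_{54}\; c_{63}\; c_{72}\; c_{81}\; c_{82}\; c_{73}\; c_{64}\; c_{55}\; c_{46}\; c_{37}\; c_{28}\; c_{38}$$
$$c_{47}\; c_{56}\; c_{65}\; c_{74}\; c_{83}\; c_{84}\; c_{75}\; c_{66}\; c_{57}\; c_{48}\; c_{58}\; c_{67}\; c_{76}\; c_{85}\; c_{86}\; c_{77}\; c_{68}\; c_{78}\; c_{87}\; c_{88} \,].$$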


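For example, the thresholded coefficient matrix,

$$\begin{bmatrix} c_{11} & c_{12} & 0 & c_{14} & c_{15} & 0 & \cdots & 0 \\ c_{21} & c_{22} & c_{23} & 0 & 0 & 0 & \cdots & 0 \\ c_{31} & 0 & 0 & 0 & 0 & 0 & \cdots & 0 \\ 0 & c_{42} & 0 & 0 & 0 & 0 & \cdots & 0 \\ \vdots & & & & & & & \vdots \\ 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix}$$

is mapped to the vector

$$[\, c_{11}\; c_{12}\; c_{21}\; c_{31}\; c_{22}\; 0\; c_{14}\; c_{23}\; 0\; 0\; 0\; c_{42}\; 0\; 0\; c_{15}\; 0\; 0\; 0\; \ldots\; 0 \,],$$

and the corresponding control word,

$$\begin{bmatrix} 1 & 1 & 0 & 1 & 1 & 0 & \cdots & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & \cdots & 0 \\ \vdots & & & & & & & \vdots \\ 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix}$$

is mapped to

[1 1 1 1 1 0 1 1 0 0 0 1 0 0 1 0 0 0 ... 0].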
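A short sketch of the zigzag scan follows. It generates the traversal order by walking the anti-diagonals of the matrix in alternating directions, which reproduces the ordering listed above for the 8×8 case; the function names are hypothetical.

```python
def zigzag_indices(n):
    """Return zero-based (row, col) pairs in zigzag order for an n x n matrix."""
    order = []
    for s in range(2 * n - 1):                    # s = row + col on each anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                            # even diagonals run bottom-left to top-right
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(matrix):
    """Flatten a square matrix into a vector using the zigzag order."""
    return [matrix[r][c] for r, c in zigzag_indices(len(matrix))]


# Control word matrix from the example: ones at c11, c12, c14, c15,
# c21, c22, c23, c31 and c42 (written here with zero-based indices).
control = [[0] * 8 for _ in range(8)]
for r, c in [(0, 0), (0, 1), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (2, 0), (3, 1)]:
    control[r][c] = 1
control_vector = zigzag_scan(control)

# One-based labels such as "c31", for comparison with the listing in the text.
labels = ["c%d%d" % (r + 1, c + 1) for r, c in zigzag_indices(8)]
```

Because the retained coefficients cluster in the upper left corner, the scan pushes the long run of zeros to the end of the vector, where truncation removes it.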

Step 7: Control Word Encoding 

To run-length encode the control word vector, it is
truncated after the last 1 and a length specifier telling how many
bits are in the truncated control word is transmitted. Two
extra bits are saved by eliminating the leading and trailing 1s
of the truncated control word vector (the control word vector
always begins with 1 because $c_{11}$ is always retained, and it
always ends in 1 by definition of the truncation). The length
specifier contains $\log_2(N^2)$ bits (6 bits for 8×8 blocks).

In the example above, the transmitted coefficients are

$$[\, c_{11}\; c_{12}\; c_{21}\; c_{31}\; c_{22}\; c_{14}\; c_{23}\; c_{42}\; c_{15} \,].$$

The truncated control word vector is

[1 1 1 1 1 0 1 1 0 0 0 1 0 0 1].

After eliminating the leading and trailing 1s, the control
word vector becomes

[1 1 1 1 0 1 1 0 0 0 1 0 0].

The control word is now 13 bits in length, so the length
specifier (in 6-bit binary) is

[001101].
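The control word encoding can be sketched as follows, including the all-ones specifier reserved for a block in which only the first coefficient is retained. The function names and the string representation of the specifier are hypothetical choices made for illustration.

```python
def encode_control_word(bits):
    """Truncate after the last 1, drop the leading and trailing 1s, and
    return (length specifier as a bit string, remaining inner bits)."""
    width = (len(bits) - 1).bit_length()          # 6 bits for a 64-bit control word
    last_one = len(bits) - 1 - bits[::-1].index(1)
    truncated = bits[:last_one + 1]
    if len(truncated) == 1:
        # Special case: only the first coefficient was retained.
        return "1" * width, []
    inner = truncated[1:-1]
    return format(len(inner), "0{}b".format(width)), inner


def decode_control_word(specifier, inner, total_bits):
    """Rebuild the full-length control word from the transmitted pieces."""
    if set(specifier) == {"1"}:
        restored = [1]
    else:
        restored = [1] + list(inner) + [1]
    return restored + [0] * (total_bits - len(restored))


# Control word vector from the example, padded with zeros to 64 bits.
bits = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1] + [0] * 49
specifier, inner = encode_control_word(bits)
rebuilt = decode_control_word(specifier, inner, 64)
```

For the example vector this yields the specifier 001101 and the 13 inner bits shown above, so 19 bits replace the original 64.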



US 7,249,153 B2

The length specifier and control word are transmitted 
along with the quantized coefficients. 

The exception to the above-described encoding scheme is
when $c_{11}$ is the only coefficient retained. In that case, the
64-bit control word vector is

[1 0 0 0 ... 0].

After truncation and elimination of the leading 1 (which
is also the trailing 1), 0 bits are retained in the control word
vector, so 0 bits are transmitted. This special case will be
encoded with a length specifier of all 1s,

[111111].

When the decoder sees a length specifier containing all 1s,
it will not read a control word and will know that exactly one
coefficient was retained in that block, namely $c_{11}$.

Step 8: Coefficient Encoding (Lossless) 

The quantized coefficients (step 4) are losslessly encoded
using DPCM followed by Rice encoding. This is achieved
by first grouping together M blocks of the image (M is
determined by the transmission packet size and the desired
compression ratio: the larger the compression ratio and packet
size, the larger M). The M blocks of data are processed as in
steps 1-7, and the quantized coefficients are grouped together based
on their index within the block in preparation for lossless compression.
The order of the indices is chosen according to the zigzag
scan described above.

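The quantized coefficients $c^m_{ij}$ ($1 \le i, j \le 8$, $1 \le m \le M$) in
the M blocks

$$\begin{bmatrix} c^m_{11} & \cdots & c^m_{18} \\ \vdots & \ddots & \vdots \\ c^m_{81} & \cdots & c^m_{88} \end{bmatrix}, \quad m = 1, \ldots, M,$$

are grouped as follows:

The M 11-bit coefficients:

$$[\, c^1_{11}\; c^2_{11}\; \cdots\; c^M_{11} \,]$$

The variable number of 8-bit coefficients:

$$[\, c^1_{12}\; c^2_{12}\; \cdots\; c^M_{12}\;\; c^1_{21}\; c^2_{21}\; \cdots\; c^M_{21}\;\; \cdots\;\; c^1_{88}\; c^2_{88}\; \cdots\; c^M_{88} \,]$$

(Note that many of the $c^m_{ij}$ are set to zero by thresholding
and are not transmitted.)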

DPCM encoding followed by Rice compression is then
applied first to the stream of 11-bit coefficients, and then to
the stream of 8-bit coefficients. This coefficient ordering
ensures that the difference between successive coefficients is
small enough to make DPCM followed by Rice encoding
more efficient than had the coefficients been grouped in
some other order.
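The idea of DPCM followed by Rice encoding can be sketched as below. The text does not specify the details, so several choices here are assumptions: the first value of a stream is kept as-is, signed differences are folded onto non-negative integers before coding, the Rice parameter `k` is fixed by hand, and bits are held in a Python string for readability. All function names are hypothetical.

```python
def dpcm_encode(values):
    """Keep the first value; replace each later value by its difference
    from the previous one."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def dpcm_decode(diffs):
    """Undo dpcm_encode() by accumulating the differences."""
    out = [diffs[0]]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out


def to_unsigned(d):
    """Fold signed differences onto 0, 1, 2, ... (0, -1, 1, -2, 2, ...)."""
    return 2 * d if d >= 0 else -2 * d - 1


def from_unsigned(u):
    """Inverse of to_unsigned()."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def rice_encode(values, k):
    """Rice code with parameter k >= 1: unary quotient, a 0 terminator,
    then a k-bit remainder."""
    out = []
    for v in values:
        out.append("1" * (v >> k) + "0" + format(v & ((1 << k) - 1), "0{}b".format(k)))
    return "".join(out)


def rice_decode(bits, k, count):
    """Decode `count` values from a Rice-coded bit string."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q, pos = q + 1, pos + 1
        pos += 1                                  # skip the 0 terminator
        values.append((q << k) | int(bits[pos:pos + k], 2))
        pos += k
    return values


stream = [1685, 1694, 1596]                       # an 11-bit coefficient stream
diffs = dpcm_encode(stream)                       # first value, then differences
mapped = [to_unsigned(d) for d in diffs[1:]]
encoded = rice_encode(mapped, k=4)
decoded = [diffs[0]] + [from_unsigned(u) for u in rice_decode(encoded, 4, len(mapped))]
recovered = dpcm_decode(decoded)
```

Small differences produce short codes, which is why grouping similar coefficients from neighbouring blocks helps; in practice the Rice parameter would be chosen adaptively rather than fixed.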


Step 9: Transmission

Each group of M blocks is transmitted in one packet.
Transmitted first is a packet header. The header might
contain information about the location in the image of the
starting block, and the number M of blocks in the packet.
Next to be transmitted are the M 6-bit control word length
specifiers, then the M truncated control words, then the
losslessly encoded coefficients.


EXAMPLE

Transmission of M=3 8×8 blocks:

Block 1: 

After thresholding, all coefficients except for $c_{11}$ = 10884.0
and $c_{21}$ = -111.085 are zeroed.

The quantized coefficients are then

[$c_{11}$ $c_{21}$] = [10880 -112],

or, after converting to binary according to step 4,

[1101 0 010101   0110 1 110],

which when converted to integers are

[1685 110].

The control word vector is

[1 0 1 0 0 0 ... 0].

After truncation and removal of leading and trailing 1s,
the control word vector becomes

[0],

and the corresponding length specifier is 1, or

[000001].

Block 2:

After thresholding, all coefficients except for
$c_{11}$ = 12019.67, $c_{12}$ = -347.118, and $c_{21}$ = 89.045 are zeroed.

The quantized coefficients are then

[$c_{11}$ $c_{12}$ $c_{21}$] = [12032 -352 88],

or, after converting to binary according to step 4,

[1101 0 011110   1000 1 011   0110 0 011],

which when converted to integers are

[1694 139 99].

The control word vector is

[1 1 1 0 0 0 ... 0].

After truncation and removal of leading and trailing 1s,
the control word vector becomes

[1],

and the corresponding length specifier is 1, or

[000001].

Block 3:

After thresholding, all coefficients except for $c_{11}$ = 7932.9
are zeroed.

The quantized coefficients are then

[$c_{11}$] = [7936],

or, after converting to binary according to step 4,

[1100 0 111100],

which when converted to an integer is

[1596].

The control word vector is

[1 0 0 0 0 0 ... 0].

This is the special case where no bits are transmitted for
the control word and the corresponding length specifier
contains all 1s, or

[111111].

DPCM and Rice encoding is then applied first to the $c_{11}$
coefficients (which are 11-bit values):

[1685 1694 1596],

then to the remaining coefficients (which are all 8-bit values)
in the order described in step 8:

[139 110 99].

This lossless compression results in some binary stream
of values:

[($c_{11}$ binary stream) (remaining coefficient binary
stream)].

Without the header, the transmission stream for this 
packet containing M=3 blocks is then 
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If we label the edges of a given block B as follows 


0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 1 1 
Block 1 Block 2 Block 3 

6-bit length specifiers 

0 , 1 , - . [binary stream] 

Block 1 Block 2 Block 3 lossless ty 

control word compressed 

transform 
coefficients 


Ground Operations 
Decoding 

Decoding occurs on a block-by-block basis. The coefficients
are read from the transmission stream and decoded.
The decompression is achieved by applying the inverse
Chebyshev transform to the reconstructed coefficient matrix,

$$g_{ij} = \frac{1}{4} c_{11} - \frac{1}{2} \left( \sum_{k=1}^{N} c_{k1} T_{kj} + \sum_{m=1}^{N} c_{1m} T_{mi} \right) + \sum_{k=1}^{N} \sum_{m=1}^{N} c_{km} T_{kj} T_{mi}$$

where

$$T_{ki} = \cos\left( \frac{\pi (k-1)(i - \frac{1}{2})}{N} \right), \qquad i, j = 1, \ldots, N,$$

and $g_{ij}$ are the decompressed pixels.

Block Artifact Reduction (Optional)

Block artifacts are jumps in the pixel values from one
block edge to the adjacent block edge, and are more apparent
in highly compressed images. They can be reduced without
losing any image detail by adding a gradient correction
matrix $\nabla_{B_k}$ to each block $B_k$, where k indexes the
blocks in the image. The correction matrix adjusts the pixel
values such that the adjacent edges of any two blocks have
the same means.

The correction matrix for a block is defined by

$$\nabla_{B_k} = a + bX + cY + d\,X*X + e\,Y*Y, \qquad k = 1, \ldots, \text{number of blocks}$$

where

$$X = \begin{bmatrix} 0 & 1 & \cdots & 7 \\ 0 & 1 & \cdots & 7 \\ \vdots & \vdots & & \vdots \\ 0 & 1 & \cdots & 7 \end{bmatrix}, \qquad Y = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 7 & 7 & \cdots & 7 \end{bmatrix}$$

and * represents the element-by-element product of two
matrices.

The correction coefficients {a,b,c,d,e} can be found by
solving a system of equations formed by applying 5 conditions
on the block:

1) The average value of $\nabla_{B_k}$ over the block $B_k$ is zero:

$$0 = \overline{\nabla_{B_k}} = a + b\,\overline{X} + c\,\overline{Y} + d\,\overline{X*X} + e\,\overline{Y*Y} = a + 3.5b + 3.5c + 17.5d + 17.5e$$

2)-5) The mean value of each edge of the block matches the
mean of the adjacent block's edge.

With $E_1$ representing the top row of pixels in the block, $E_2$
the right-most column of pixels, $E_3$ the bottom row, and $E_4$
the left-most column, then for each block B we can write
2)-5) as

$$\delta E_i = a + b\,\overline{X}_{E_i} + c\,\overline{Y}_{E_i} + d\,\overline{(X*X)}_{E_i} + e\,\overline{(Y*Y)}_{E_i},$$

where i=1, 2, 3, 4 and $\delta E_i$ is the average of the difference of
the means of adjacent block edges.

For example, if we have the following configuration of
blocks

[Diagram of the configuration of blocks $B_1$, $B_2$, $B_3$, $B_4$; layout not legible in the source]

then, for block $B_3$,

[Expressions for $\delta E_1$ through $\delta E_4$ of block $B_3$; not legible in the source]

where $g^k_{ij}$ is the ij-th reconstructed pixel in block $B_k$.
Equations 1)-5) can be written as the matrix equation

$$\begin{bmatrix} 1 & 3.5 & 3.5 & 17.5 & 17.5 \\ 1 & 3.5 & 0 & 17.5 & 0 \\ 1 & 7 & 3.5 & 49 & 17.5 \\ 1 & 3.5 & 7 & 17.5 & 49 \\ 1 & 0 & 3.5 & 0 & 17.5 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix} = \begin{bmatrix} 0 \\ \delta E_1 \\ \delta E_2 \\ \delta E_3 \\ \delta E_4 \end{bmatrix}$$

and can be solved for {a,b,c,d,e} by taking the inverse of the
matrix. Using {a,b,c,d,e}, the correction matrix $\nabla_{B_k}$ can be
calculated. The gradient-corrected pixel value $g^c_{ij}$ for block
$B_k$ is then

$$g^c_{ij} = g_{ij} + (\nabla_{B_k})_{ij}, \qquad i, j = 1, \ldots, N.$$

If an edge of a given block lies on the border of the image,
$\delta E_i$ for that edge is set to zero.
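The five conditions on the block can be illustrated with a short sketch. The Python below is not the patented implementation; it assumes an 8x8 block, takes the four edge offsets as already-computed inputs (the values used are made up), and uses hypothetical helper names (`solve`, `system_matrix`, `correction_matrix`). It builds the 5x5 system from the means of 1, X, Y, X*X and Y*Y over the block and over each edge, solves for {a,b,c,d,e}, and evaluates the correction matrix.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    M = [list(map(float, row)) + [float(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def system_matrix(n=8):
    """Means of 1, X, Y, X*X, Y*Y over the block and over edges E_1..E_4."""
    idx = range(n)
    regions = [
        [(i, j) for i in idx for j in idx],   # whole block
        [(0, j) for j in idx],                # E_1: top row
        [(i, n - 1) for i in idx],            # E_2: right-most column
        [(n - 1, j) for j in idx],            # E_3: bottom row
        [(i, 0) for i in idx],                # E_4: left-most column
    ]
    # X[i][j] = j (column index) and Y[i][j] = i (row index), as in the text.
    basis = [lambda i, j: 1.0, lambda i, j: j, lambda i, j: i,
             lambda i, j: j * j, lambda i, j: i * i]
    return [[mean(f(i, j) for i, j in reg) for f in basis] for reg in regions]


def correction_matrix(delta_e, n=8):
    """Gradient correction a + bX + cY + dX*X + eY*Y for one block."""
    a, b, c, d, e = solve(system_matrix(n), [0.0] + list(delta_e))
    return [[a + b * j + c * i + d * j * j + e * i * i for j in range(n)]
            for i in range(n)]


# Made-up edge offsets (top, right, bottom, left), for illustration only.
corr = correction_matrix([2.0, -1.0, 0.0, 0.5])
```

For n = 8, `system_matrix` reproduces the numerical matrix given above, and the resulting correction has zero mean over the block while its edge means equal the requested offsets.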

Chebyshev approximation is well-known; however, for
the purpose of simplifying the explanation of the above-described
inventive process, it is described in more detail
below.

The Chebyshev polynomials of the first kind are defined
by $T_n(x) = \cos(n \arccos x)$. These are orthogonal polynomials
of degree n on the interval $-1 \le x \le 1$, with the weight
$1/\sqrt{1-x^2}$. They satisfy the continuous orthogonality relation

$$\int_{-1}^{1} \frac{T_i(x)\,T_j(x)}{\sqrt{1-x^2}}\,dx = \begin{cases} 0, & i \ne j \\ \pi/2, & i = j \ne 0 \\ \pi, & i = j = 0. \end{cases} \qquad \text{Equation (1)}$$

The polynomial $T_n(x)$ has n zeros on the interval [-1, 1],
at

$$x_k = \cos\left( \frac{\pi (k - \frac{1}{2})}{n} \right) \qquad \text{Equation (2)}$$

for k=1, 2, . . . , n. When $T_m(x)$ is evaluated at its m zeros
$x_k$ (k=1, . . . , m) given by (2), the polynomials of degree i,
j<m also satisfy the discrete orthogonality relation

$$\sum_{k=1}^{m} T_i(x_k)\,T_j(x_k) = \begin{cases} 0, & i \ne j \\ m/2, & i = j \ne 0 \\ m, & i = j = 0. \end{cases} \qquad \text{Equation (3)}$$


The Chebyshev approximation of order N to a function
f(x) is defined by an expansion in terms of Chebyshev
polynomials,

$$f(x) \approx \left[ \sum_{k=0}^{N-1} c_k T_k(x) \right] - \frac{c_0}{2}, \qquad \text{Equation (4)}$$


where the N coefficients are given by

$$c_j = \frac{2}{N} \sum_{k=1}^{N} f(x_k)\,T_j(x_k) \qquad \text{Equation (5)}$$

for j=0, . . . , N-1, and $x_k$ are as before the zeros of $T_N(x)$.

The Chebyshev approximation is exact on the N zeros of
the polynomial $T_N(x)$ in [-1, 1]. A suitable transformation
from Equation (2) enables the N zeros to give equal
intervals along the x-axis. Furthermore, the Chebyshev
approximation has an equal error property, whereby the error
of the approximation is distributed almost uniformly over
the fitting interval, so it is an excellent approximation to the
so-called min-max polynomial which has the least value of
the maximum deviation from the true function in the fitting
interval. Most important, the Chebyshev approximation can
be truncated to a polynomial of much lower degree that
retains the equal error property. These mathematical properties
of the Chebyshev polynomials are key to their usefulness
for approximation and interpolation, as well as for
compression of time series data.
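Equations (2), (4) and (5) translate almost directly into code. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the k-th sample of a block is assigned to the k-th zero of $T_N(x)$. It computes the coefficients from N samples and evaluates the expansion, which reproduces the samples at the zeros up to round-off.

```python
import math


def cheb_zeros(n):
    """Zeros of T_n from Equation (2): x_k = cos(pi (k - 1/2) / n), k = 1..n."""
    return [math.cos(math.pi * (k - 0.5) / n) for k in range(1, n + 1)]


def cheb_T(j, x):
    """Chebyshev polynomial of the first kind, T_j(x) = cos(j arccos x)."""
    return math.cos(j * math.acos(x))


def cheb_coeffs(samples):
    """Equation (5): c_j = (2/N) sum_k f(x_k) T_j(x_k), j = 0..N-1."""
    n = len(samples)
    xs = cheb_zeros(n)
    return [2.0 / n * sum(f * cheb_T(j, x) for f, x in zip(samples, xs))
            for j in range(n)]


def cheb_eval(coeffs, x):
    """Equation (4): sum_k c_k T_k(x) - c_0 / 2."""
    return sum(c * cheb_T(k, x) for k, c in enumerate(coeffs)) - coeffs[0] / 2.0


samples = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # arbitrary test data
coeffs = cheb_coeffs(samples)
rebuilt = [cheb_eval(coeffs, x) for x in cheb_zeros(len(samples))]
```

Keeping all N coefficients makes `rebuilt` agree with `samples` to within floating-point error; dropping high-order entries of `coeffs` gives the truncated approximation discussed above.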

The Chebyshev compression algorithm is a form of
transform encoding applied to data blocks. This category of
algorithm takes a block of data (1- or 2-dimensional) and
performs a unitary transform, quantizes the resulting coefficients,
and transmits them. Many types of transforms have
been used for image compression, including Fourier series
(FFT), Discrete Cosine (DCT), and wavelet transforms.
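For the 2-dimensional case, the transform step and the inverse used in decoding can be sketched as a round trip on a small block. The Python below is a minimal illustration with hypothetical function names. The forward normalization of 4/N^2 is an assumption, obtained by applying Equation (5) along each dimension; the inverse follows from applying Equation (4) along each dimension, written with zero-based indices.

```python
import math


def basis(n):
    """T[k][i] = cos(pi * k * (i + 1/2) / n), with zero-based k and i."""
    return [[math.cos(math.pi * k * (i + 0.5) / n) for i in range(n)]
            for k in range(n)]


def forward_2d(block):
    """Assumed forward transform: c[k][m] = (4/N^2) sum_ij g[i][j] T[k][j] T[m][i]."""
    n = len(block)
    T = basis(n)
    return [[4.0 / n ** 2 * sum(block[i][j] * T[k][j] * T[m][i]
                                for i in range(n) for j in range(n))
             for m in range(n)] for k in range(n)]


def inverse_2d(c):
    """Inverse transform: double sum minus half the edge sums plus c[0][0] / 4."""
    n = len(c)
    T = basis(n)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            double = sum(c[k][m] * T[k][j] * T[m][i]
                         for k in range(n) for m in range(n))
            single = (sum(c[k][0] * T[k][j] for k in range(n))
                      + sum(c[0][m] * T[m][i] for m in range(n)))
            row.append(0.25 * c[0][0] - 0.5 * single + double)
        out.append(row)
    return out


block = [[1.0, 2.0, 3.0, 4.0],
         [2.0, 4.0, 6.0, 8.0],
         [0.0, 1.0, 0.0, 1.0],
         [9.0, 7.0, 5.0, 3.0]]   # arbitrary test data
restored = inverse_2d(forward_2d(block))
```

With no thresholding or quantization in between, `restored` matches `block` to within round-off, which is a convenient check that the index conventions of the two directions agree.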

The Chebyshev polynomials are related to the DCT as 
shown in Equation (2). The Chebyshev approximation, 
because of the equal error and min-max properties, is 
particularly suitable for compression of time-series data, 
independent of correlations. 

The actual implementation of the Chebyshev approximation
to achieve the present invention is numerically simple
and the computational burden is low as shown by Equation
(5). It is necessary only to calculate linear combinations of
data samples with a small number of known, fixed coefficients.
Because the coefficients $T_j(x_k)$ given in Equation (5)
are known, they need not be calculated by the processor, but
rather are stored as a table look-up or loaded into memory
in advance.
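The table look-up idea can be sketched as follows. This Python fragment is illustrative only, with hypothetical names: the table of weights (2/m) T_j(x_k) is built once for a given block size m, and each block of the serial stream then needs only multiply-and-add operations. Dropping a trailing partial block is a simplification made for this sketch, not something stated in the text.

```python
import math


def make_table(m):
    """Precompute (2/m) * T_j(x_k) for j, k = 0..m-1 (zero-based k)."""
    return [[2.0 / m * math.cos(math.pi * j * (k + 0.5) / m) for k in range(m)]
            for j in range(m)]


def block_coeffs(table, block):
    """Equation (5) as m dot products against the precomputed table."""
    return [sum(w * f for w, f in zip(row, block)) for row in table]


def stream_coeffs(data, m):
    """Transform a serial data stream block by block.

    The table is computed once and reused for every block; a trailing
    partial block is ignored in this sketch.
    """
    table = make_table(m)
    return [block_coeffs(table, data[s:s + m])
            for s in range(0, len(data) - m + 1, m)]


coeffs = stream_coeffs([5.0] * 8, 4)   # two constant blocks of four samples
```

For a constant block the only nonzero coefficient is c_0, equal to twice the constant, so each entry of `coeffs` here is approximately [10, 0, 0, 0].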

The Chebyshev approximation in accordance with the 
present invention can be programmed using the high level 
language IDL, although other languages can also be used. 
By way of example, an IDL-language embodiment is 
described. A compression routine was written in which three 
parameters could be modified to evaluate the technique. 
These included the following: 

Block Size: The number of samples of the serial data
stream during one iteration of the compression method. The
routine continues to process "blocks" of the raw data until
the entire data set has been compressed.

Threshold: An adjustable parameter to balance high compression
factors without excessive loss of precision. The
threshold value will determine the number of coefficients
retained within a block of the serial data stream.

Bits: The number of bits returned from each retained
Chebyshev coefficient.

To provide some insight into the Chebyshev method of the 
present invention, consider a serial data set of N measure- 
ments taken in blocks of m samples. Initially, coefficients for 
all m samples within a block are calculated. Because the 
Chebyshev approximation is exact on the m zeros of the 
polynomial, retaining all of these m coefficients preserves 
the original data within the block exactly (within round-off 
errors). By thresholding the coefficients, it is possible to 
reduce the number of coefficients needed to reconstruct the 
original m samples with sufficient accuracy to be scientifi- 
cally useful. However, it is necessary to record which of the 
m coefficients within a block have been retained (as well as 
the coefficients themselves) in order to reconstruct the data 
accurately. It is more efficient to use larger block sizes, but 
there is a trade-off with accuracy. 
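The thresholding step, together with the record of which coefficients were kept, can be sketched as below. This is a minimal Python illustration, not the IDL routine described above. The rule used (keep a coefficient when its magnitude is at least the threshold) and the boolean mask used as the record of retained positions are assumptions made for the example, and all function names are hypothetical.

```python
import math


def cheb_coeffs(block):
    """Equation (5) for one block, with zero-based sample index k."""
    m = len(block)
    return [2.0 / m * sum(f * math.cos(math.pi * j * (k + 0.5) / m)
                          for k, f in enumerate(block))
            for j in range(m)]


def cheb_reconstruct(c):
    """Equation (4) evaluated at the m zeros of T_m."""
    m = len(c)
    return [sum(cj * math.cos(math.pi * j * (k + 0.5) / m)
                for j, cj in enumerate(c)) - c[0] / 2.0
            for k in range(m)]


def threshold(c, t):
    """Keep coefficients with |c_j| >= t; return (mask, kept values)."""
    mask = [abs(v) >= t for v in c]
    kept = [v for v, keep in zip(c, mask) if keep]
    return mask, kept


def expand(mask, kept):
    """Rebuild the full coefficient list, with zeros where values were dropped."""
    values = iter(kept)
    return [next(values) if keep else 0.0 for keep in mask]


def compress_block(block, t):
    return threshold(cheb_coeffs(block), t)


def decompress_block(mask, kept):
    return cheb_reconstruct(expand(mask, kept))


block = [10.0, 10.4, 10.9, 11.1, 11.0, 10.6, 10.2, 10.1]   # arbitrary test data
mask, kept = compress_block(block, 0.2)
approx = decompress_block(mask, kept)
```

Because each Chebyshev polynomial is bounded by 1 in magnitude on the fitting interval, the error at any sample cannot exceed the sum of the magnitudes of the dropped coefficients, which is one way to see how the threshold controls the loss of precision.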

For JPEG or similar algorithms, a variance analysis is 
done on a representative data set and the bit allocation is 
fixed for a particular type of data. A key advantage of the 
Chebyshev approximation of the present invention is that 
simple thresholding of the coefficients can be computed in 
real time on the actual data. This thresholding technique 
accomplishes the same purpose as the variance analysis in
other algorithms because of the equal error and min-max 
properties of the Chebyshev polynomials. 

The Chebyshev compression method of the present invention
differs from standard lossy-compression techniques
such as the Graphics Interchange Format (GIF) or the Joint
Photographic Experts Group (JPEG) or the Discrete Cosine
Transform (DCT) in that the Chebyshev technique does not
rely on 2-dimensional correlations in a data set. In fact, one
of the advantages to the block encoding performed by this
method is that it works extremely well on serial (1-dimensional)
data. The min-max and equal error properties ensure
high fidelity in the reconstructed data with absolute errors
roughly independent of the signal strength. This has the
consequence that the data compression tends to suppress
noise but still reproduces the high signal-to-noise peaks in
the data with maximum percentage accuracy. Applicants are
aware of no comparable compression techniques that are as
computationally simple and that work as well on time-series
data. These data sets are typical of instruments used in many
areas of space science (for example, hyperspectral data,
spectra, particle instruments, magnetometers, etc.).

The above-described steps can be implemented using
standard well-known programming techniques. The novelty
of the above-described embodiment lies not in the specific
programming techniques but in the use of the steps
described to achieve the described results. Software programming
code which embodies the present invention is
typically stored in permanent storage of some type, such as
permanent storage of a processor located on board a spacecraft.
In a client/server environment, such software programming
code may be stored with storage associated with a
server. The software programming code may be embodied
on any of a variety of known media for use with a data
processing system, such as a diskette, or hard drive, or
CD-ROM. The code may be distributed on such media, or
may be distributed to users from the memory or storage of
one computer system over a network of some type to other
computer systems for use by users of such other systems.
The techniques and methods for embodying software program
code on physical media and/or distributing software
code via networks are well known and will not be further
discussed herein.

It will be understood that each element of the illustrations,
and combinations of elements in the illustrations, can be
implemented by general and/or special purpose hardware-based
systems that perform the specified functions or steps,
or by combinations of general and/or special-purpose hardware
and computer instructions.

These program instructions may be provided to a processor
to produce a machine, such that the instructions that
execute on the processor create means for implementing the
functions specified in the illustrations. The computer program
instructions may be executed by a processor to cause
a series of operational steps to be performed by the processor
to produce a computer-implemented process such that the
instructions that execute on the processor provide steps for
implementing the function specified in the illustrations.
Accordingly, the figures herein support combinations of
means for performing the specified functions, combinations
of steps for performing the specified functions, and program
instruction means for performing the specified functions.

While there has been described herein the principles of
the invention, it is to be understood by those skilled in the
art that this description is made only by way of example and
not as a limitation to the scope of the invention. For
example, while specific implementations in one and two-dimensional
applications are described in detail herein,
three-dimensional data can be compressed using the same
inventive method and specific implementations for doing so
are intended to be covered by the present claims and will be
readily apparent to artisans of ordinary skill. Accordingly, it
is intended by the appended claims to cover all modifications
of the invention which fall within the true spirit and
scope of the invention.


Invention claimed is: 

1. A method of compressing data, comprising the step of 
approximating said data using Chebyshev polynomials, fur- 
ther comprising the step of: 

dividing said data into data blocks of a predetermined 
size, to form matrices corresponding to each data 
block; 

transforming the data in each matrix using Chebyshev 
polynomials to form corresponding matrices of Che- 
byshev coefficients; and 

creating compressed data using the Chebyshev coeffi- 
cients. 

2. The method of claim 1, said creating compressed data 
using the Chebyshev coefficients further comprising the step 
of: 

thresholding the Chebyshev coefficients in each matrix to 
retain in each matrix only Chebyshev coefficients that 
are of a predetermined value. 

3. The method of claim 2, further comprising the step of: 
quantizing said Chebyshev coefficient matrices to create a
compressed data block corresponding to each of said
data blocks.

4. The method of claim 3, further comprising the step of: 
creating control words for each of said compressed data
blocks, said control enabling decompression of said
compressed data blocks in proper sequence.

5. The method of claim 4, wherein said quantizing step 
comprises at least the step of: 

performing floating point quantization on said Chebyshev 
coefficient matrices. 

6. The method of claim 4, wherein said quantizing step 
comprises at least the step of: 

performing inverse hyperbolic sine compander quantiza- 
tion on said Chebyshev coefficient matrices. 

7. The method of claim 4, further comprising the step of: 
losslessly compressing said control words. 

8. The method of claim 7, further comprising the steps of: 
transmitting said compressed data blocks and said com- 
pressed control words to a receiver; 

decoding said compressed control words and compressed 
data blocks; and 

performing block artifact reduction on said decoded data 
blocks. 

9. The method of claim 1, wherein said data comprises 
time-series data. 

10. A hardware system of compressing data, comprising 
means for approximating said data using Chebyshev poly- 
nomials, further comprising: 

means for dividing said data into data blocks of a prede- 
termined size, to form matrices corresponding to each 
data block; 

means for transforming the data in each matrix using 
Chebyshev polynomials to form corresponding matri- 
ces of Chebyshev coefficients; and 
means for creating compressed data using the Chebyshev 
coefficients. 

11. The system of claim 10, said means for creating 
compressed data using the Chebyshev coefficients further 
comprising: 

means for thresholding the Chebyshev coefficients in each 
matrix to retain in each matrix only Chebyshev coef- 
ficients that are of a predetermined value. 

12. The system of claim 11, further comprising:
means for quantizing said Chebyshev coefficient matrices
to create a compressed data block corresponding to
each of said data blocks.

13. The system of claim 12, further comprising:

means for creating control words for each of said com-
pressed data blocks, said control enabling decompres-
sion of said compressed data blocks in proper
sequence.

14. The system of claim 13, wherein said means for 
quantizing comprises: 

means for performing floating point quantization on said 
Chebyshev coefficient matrices. 

15. The system of claim 13, wherein said means for
quantizing comprises:

means for performing inverse hyperbolic sine compander 
quantization on said Chebyshev coefficient matrices. 

16. The system of claim 13, further comprising: 

means for losslessly compressing said control words.

17. The system of claim 16, further comprising:
means for transmitting said compressed data blocks and
said compressed control words to a receiver;
means for decoding said compressed control words and
compressed data blocks; and

means for performing block artifact reduction on said
decoded data blocks.

18. The system of claim 10, wherein said data comprises 
time-series data. 

19. A computer program product recorded on computer
readable storage medium for compressing data, comprising
computer readable means for approximating said data using
Chebyshev polynomials, further comprising:

computer readable means for dividing said data into data
blocks of a predetermined size, to form matrices corresponding
to each data block;
computer readable means for transforming the data in
each matrix using Chebyshev polynomials to form
corresponding matrices of Chebyshev coefficients; and
computer readable means for creating compressed data
using the Chebyshev coefficients.

20. The computer program product of claim 19, said 
computer readable means for creating compressed data 
using the Chebyshev coefficients further comprising: 

computer readable means for thresholding the Chebyshev
coefficients in each matrix to retain in each matrix only
Chebyshev coefficients that are of a predetermined
value.

21. The computer program product of claim 20, further
comprising:

computer readable means for quantizing said Chebyshev 
coefficient matrices to create a compressed data block 
corresponding to each of said data blocks. 

22. The computer program product of claim 21, further 
comprising: 

computer readable means for creating control words for 
each of said compressed data blocks, said control 
enabling decompression of said compressed data 
blocks in proper sequence. 

23. The computer program product of claim 22, wherein 
said computer readable means for quantizing comprises: 

computer readable means for performing floating point 
quantization on said Chebyshev coefficient matrices. 

24. The computer program product of claim 22, wherein 
said computer readable means for quantizing comprises: 

computer readable means for performing inverse hyper- 
bolic sine compander quantization on said Chebyshev 
coefficient matrices. 

25. The computer program product of claim 22, further 
comprising: 

computer readable means for losslessly compressing said 
control words. 

26. The computer program product of claim 25, further 
comprising: 

computer readable means for transmitting said com- 
pressed data blocks and said compressed control words 
to a receiver; 

computer readable means for decoding said compressed 
control words and compressed data blocks; and 

computer readable means for performing block artifact 
reduction on said decoded data blocks. 

27. The computer program product of claim 19, wherein 
said data comprises time-series data.



